ISSN 0281-9864

Remarks on the Status of Inference in the Area of Knowledge Representation

Christopher Habel

1987 No. 20

KOGNITIONSVETENSKAPLIG FORSKNING
Cognitive Science Research
Lund University, Sweden

Communications should be sent to: Bernhard Bierschenk, Department of Psychology, Paradisgatan 5, Lund University, S-22350 Lund, Sweden

Abstract

The concept of inference is one of the global concepts used for the explanation of cognitive processes. There exist mainly two types of characterization: the logical and the psychological. These different characterizations are based on the difference between inference and rule of inference. Information processing systems can be formalized as inferential systems, i.e. systems with inferential processes. The fundamental concepts of this formalization, those of dynamical inferential systems and time-restricted derivations, both based on inferential processes, are described in detail.

1. Preliminaries on Cognitive Science

One of the fundamentals of Cognitive Science (CS) - and thus of Cognitive Psychology and Artificial Intelligence (AI) - is the view of minds as information processing systems (IPS), cp. Haugeland (1978). The strongest, i.e. most far-reaching, version of Cognitivism can be characterized by Newell's (1980) Physical Symbol System Hypothesis:

"... humans are instances of physical symbol systems, and by virtue of this, mind enters into the physical universe." (p. 136)

Thus the subject of Cognitive Science is the inquiry into information processes from different points of view and with different goals.
Considering the whole spectrum from Cognitive Psychology to applied Artificial Intelligence, this leads to such different goals as the explanation of processes in the human mind, on the one hand, and the use of information processes in applied AI systems, on the other hand. (The current state of the art is dominated by a strategy of investigation which Churchland (1984; p. 106) characterizes - w.r.t. AI - as the "piecemeal approach": only very specific processes are investigated; a global view of the whole phenomenon of human intelligence is still beyond the range of our scientific knowledge.) Analogously, we also have a spectrum of experiments to lay an empirical base for the investigation of mind. This reaches from experiments with natural subjects (humans or animals) in the psychological tradition to computer systems in the simulation mode. (From now on I will neglect the application-oriented parts of AI and therefore the application mode of AI systems too.)

Later on in the present paper I will sketch a third type of evidence, which could be called theoretical evidence. Such evidence is only possible by virtue of strict formalisms. And at this point, two older relatives of the twins Cognitive Psychology and Artificial Intelligence, namely Mathematics and Logic, will enter the stage.

Following the information processing paradigm of the human mind, it is necessary to postulate system-internal models of real or fictional worlds. These internal models are built up from (formal) knowledge entities on a semantic or intentional level. (On this level cp. Pylyshyn 1984; p. 210-211.) Beyond these representations, procedures (procedure is used here in a non-technical sense) are needed to work on the representations. By this the subject area of Representation of Knowledge can be outlined: the investigation of these representations and of the operations on them that change the internal models.

2. Inferences

In the present section I will give emphasis to the concept that is, from my point of view, the most relevant in the area of knowledge representation: inference. The concept of inference is one of the global concepts (see above) used for the explanation of general cognitive processes. The notion of inference can be found all over Cognitive Science, but without a unique and well-established definition. There exist mainly two types of characterization:

the logical characterization:
"THE RULES OF INFERENCE amount to directions as to how sentences already known as true may be transformed so as to yield new true sentences." (Tarski, 1965; p. 47)

the psychological characterization:
"The process of a conclusion, or a conclusion reached, on the basis of previously made or accepted judgements." (Drever, 1964; p. 136)

These roots lead to a converged AI characterization (or one of formal Cognitive Science):
"Inferences are well-defined changes of attitudes to knowledge entities."

At this point one important distinction has to be noted, namely that between the concepts rule of inference and inference. The former refers to an instruction or advice on how to move from a set of attitudes to some knowledge entities to another set of attitudes to the same or other knowledge entities. This means that rules of inference can be seen as inference-tickets, which license the change of attitudes. (The notion of inference-ticket is due to G. Ryle (1949; p. 117).) The latter concept (of inference) refers to the process or action of transforming attitudes itself.

Taking this distinction into consideration is very important for Cognitive Science. Rules of inference are knowledge entities of a specific type, i.e. they are part of the system's (human's) knowledge base. In contrast to this, inferences are performed by the system during information processing. The former have to do with the system's potential competence or capacities, the latter with its actual performance.
At this point a relevant pair of questions arises:

Do humans use valid, i.e. formally justified, inference rules?
Are humans' inferences valid?

If the answer were "Yes!", all would be very nice, that is, no problems would appear. But a lot of experiments show that humans are bad at drawing valid inferences; cp. Johnson-Laird's (1983) chapter "How to reason syllogistically". Since the answer has to be "No!", in Cognitive Science we have two possible reactions to this fact, which I want to name "conservative" and "liberal":

conservative reaction: Formal systems should not reproduce humans' mistakes.
liberal reaction: Formal systems should describe and explain the inference systems of humans, although they are not valid from a logical point of view.

Because the conservative opinion is - implicitly or explicitly - prevalent in the disciplines of logic and formal semantics, hardly any cognitively relevant contributions to the topic of inference processes have come from these fields. On the other hand, most cognitive psychologists, cp. Johnson-Laird (1983), agree with the liberal view, but formal inference systems describing and explaining human competence and performance are rare up to now. To develop such systems is one main topic of Cognitive Science in the future. With respect to language processing, this insight is the base of the important remarks on semantic intuition by the logicians and semanticists Barwise and Cooper (1981):

"While it is seldom made explicit, it is sometimes assumed that there is some system of axioms and rules of logic engraved on stone tablets - that an inference in natural language is valid only if it can be formalized by means of these axioms and rules. In actuality, the situation is quite the reverse. The native speaker's judgement as to whether a certain inference is correct, whether the truth of the hypothesis implies the truth of the conclusion, is the primary evidence of a semantic theory ..." (Barwise/Cooper 1981; p. 201-202)

To sum up the situation: In Cognitive Science and its related disciplines, different concepts of an inferential system exist, depending on the main research topics of the discipline in question. What is needed is a unified treatment of inferences and reasoning processes (cp. Section 4 of this paper).

3. The basic structure of inferential systems

The most thoroughly investigated and best understood inferential systems are the calculi of formal logic. I will use the structure of these systems to exemplify and summarize the basic ideas and to introduce the notation which is used later on. Following traditional logic, e.g. Carnap (1939) or Tarski (1965), a calculus with respect to a formal language L is defined by

A, a set of sentences, the axioms, and
R, a set of rules (of inference).

Based on this, the concepts of derivations and proofs are defined in the well-known way. The set of all sentences which are derivable from A by means of rules from R is named the set of theorems:

Th(A, R) := Th_R(A)

The natural extension of this idea to applied formal systems, i.e. formal theories, leads to the deductive method; cp. Carnap (1939), Tarski (1965), Kalish/Montague (1964). The language L has to be enriched by nonlogical constants, leading to a language L'. Axioms and inference rules are defined over L'; in particular, nonlogical axioms are used. By this deductive method wide ranges of mathematics and the sciences can be treated in a strictly formal manner.

Let us now continue with a psychological interpretation - with respect to IPSs - of formal systems:

A concerns the factual knowledge of the IPS which the system contains in an explicit way (see below),
R contains the IPS's rules of inference, which determine the system's inferential capacity,
Th_R(A) refers to the implicit knowledge of the system, i.e. the knowledge entities which can be reached by derivations (or inferences).
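The reading of A as explicit knowledge, R as inferential capacity and Th_R(A) as implicit knowledge can be made concrete with a small sketch. The following is only an illustration under simplifying assumptions that are not part of the paper: sentences are plain strings, the only rule is a string-based modus ponens, and the set of theorems is finite, so that a fixed-point computation terminates. The names `modus_ponens` and `theorems` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List, Set

Sentence = str
# Assumption: a rule of inference maps the known sentences to the
# sentences that are derivable from them in one step.
Rule = Callable[[FrozenSet[Sentence]], Iterable[Sentence]]


def modus_ponens(known: FrozenSet[Sentence]) -> Set[Sentence]:
    """From 'p' and 'p -> q' derive 'q' (toy string encoding)."""
    derived = set()
    for sentence in known:
        if " -> " in sentence:
            premise, conclusion = sentence.split(" -> ", 1)
            if premise in known:
                derived.add(conclusion)
    return derived


def theorems(axioms: Set[Sentence], rules: List[Rule]) -> Set[Sentence]:
    """Compute Th_R(A) as a least fixed point.

    Terminates only when the set of theorems is finite, which is an
    assumption of this sketch and does not hold for calculi in general.
    """
    known = set(axioms)
    while True:
        new = set()
        for rule in rules:
            new.update(rule(frozenset(known)))
        new -= known
        if not new:
            return known
        known |= new


A = {"p", "p -> q", "q -> r"}   # explicit knowledge (axioms)
R = [modus_ponens]              # inferential capacity (rules)
implicit = theorems(A, R)       # implicit knowledge Th_R(A)
```

In this toy system the sentences "q" and "r" belong to Th_R(A) but not to A, i.e. they are implicit but not explicit knowledge.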
Before I go on with the explicit-implicit dichotomy, I will enter into a sketchy discussion of the question of what can be inferred (and by which methods inferences are performed) by formal systems based on standard logic.

As a first test case the natural quantifier MOST is to be mentioned. On the one hand, humans perform MOST-inferences very well. Therefore there is a need for formal inference systems to describe and explain the phenomena related to MOST-expressions. On the other hand, there exists no adequate, fully satisfactory treatment of MOST in any logical approach (cp. Rescher (1962), Kaplan (1966) and Barwise/Cooper (1981) on negative results of MOST-formalizations). Cushing's (1987) two-predicate-scope MOST-quantifier is - though interesting from a logical point of view - limited w.r.t. explanations of humans' MOST-inferences.

The second example of beyond-traditional-logic inferences concerns SEEING. As Barwise/Perry (1983) demonstrate, a logic of seeing does exist. They postulate an inventory of principles for the behaviour of perceptual reports. As an example I will only mention their Principle of Veridicality:

If b sees φ, then φ. (Barwise/Perry 1983; p. 181, 187)

(Remark: This case concerns so-called "NI-Perceptual Reports", where "NI" stands for "naked infinitive". In the present paper I will not discuss advantages and disadvantages of Barwise and Perry's proposal, which is the subject of controversial discussion in linguistics; cp. Higginbotham (1983). For my argumentation the type of formalization, not the specific solution, is used.)

The principle of veridicality can be transformed into an inference rule, e.g.

SEE(b, φ) -> φ

Using the same type of formalization, an analogous rule should be postulated for knowing, namely

KNOW(a, φ) -> φ

which is a notational variant of Hintikka's (1962) condition (C.K.) (p. 43) in a logic of knowledge. What is the moral of these examples of KNOWING and SEEING?
As I described in more detail in another paper (Habel, 1983), there exists a tendency for some operators (or concepts) to change their status from being non-logical to becoming logical. The development of epistemic logic is possible only by treating K and B (for knowing and believing) as logical constants. The trend in future formal systems will be to use more and more concepts that are non-logical today, e.g. seeing, as logical ones. This means that new logics, e.g. a logic of perceptual reports, have to be developed in a formal manner. (The treatment of SEEING in situation semantics by Barwise/Perry - see above - is just a first step on this way.)

I will now come back to the explicit-implicit distinction. Using the interpretation given at the beginning of the present section, namely

A ~ axioms ~ explicit knowledge
Th_R(A) ~ theorems ~ implicit knowledge

the relation between these types of knowledge, symbolized by

A ⊢ A* := Th_R(A)

is the relation of derivability. In contrast to this, from a psychological (or cognitive) point of view another relation and a third type of knowledge have to be emphasized, namely the actually derived knowledge A#, produced by and connected via the relation of derivation. This situation (i.e. relation) can be symbolized by

A ⊢ A#

analogously to the logical relation mentioned above.

At this point of the argumentation some remarks about my use of the logical notions are necessary: The "one-step" relation between two sets of formulas F1, F2, which is induced by application of one inference rule to the former set F1 resulting in F2, is usually named deduction or derivation and symbolized by "⊢". Often an attribute "direct" or "in one step" is used here. From a logical point of view, sequences of direct derivations are as interesting as one-step derivations. Therefore the notion of derivation (in one or more steps) is introduced, formally by means of the transitive closure of the relation of direct derivation. (Besides derivation, the term proof is also used.)
But actual derivations or proofs are not the major topics of interest for a logician; instead of the actual derivation, the possibility to derive or prove is the topic of investigation. Therefore, with respect to proof, the notion of provability is interesting. Because of this specific focus on the possibility of the existence of a derivation, I emphasize derivability when I describe the logical way of investigating derivations and inferences. At this point it should be mentioned that the question whether the relation of derivability has to be established in a constructive way is intensively discussed in the area of logic and proof theory. The details of this discussion, which belong to the foundations of mathematics, are beyond the subject of the present paper.

Questions of the type

How many steps of derivation are needed to ...?
Which way of derivation is used ...?

(both with respect to a pair consisting of a set of axioms and a sentence) are very seldom investigated in traditional logic. (A very interesting note on this topic is Boolos' (1987) paper, which describes a first-order inference rule that is not feasible in practice but is usable in a second-order variant.)

4. On the dynamics of inferential systems

Given an inferential system I = <A, R>, we can investigate inferential processes with respect to I. Such investigations are concerned with the questions stated at the end of the preceding section. Looking at a natural inferential system, e.g. a human, the dynamical properties of the system are relevant and are therefore topics in Cognitive Science. Furthermore, we have to take into consideration the distinction (mentioned above) between

A# - the actually derived knowledge, and
A* - the potentially derivable knowledge.

This distinction is interesting only from the computational point of view: A* is the topic of formal logic, whereas A# is of major interest in CS and AI.
A distinction analogous to that between A# and A* is the main topic of Levesque's (1984) "Logic of implicit and explicit belief". The major differences between his approach and mine are:

Levesque distinguishes explicit and implicit beliefs without dealing with the question of the processes which make implicit beliefs explicit.
Levesque does not deal with processes which change beliefs, i.e. he does not consider the dynamical properties of knowledge sets (see below).

Only with respect to both types of implicit knowledge, namely A# and A*, are questions of the type

Which knowledge entities are derived from A at a specific point of time?

sensible.

Beyond the dynamics with respect to specific inferences, i.e. sequences of actions, there exists from a cognitive point of view a second type of dynamics in inferential systems, namely with respect to the systems themselves. To clarify the two types of dynamics I introduce a further theory-internal entity, a set of points of time

T = {t_i}

I do not want to develop a "theory of time" here. Instead, I list some relevant properties of T:

i. T is an ordered set (ordering <)
ii. T is infinite
iii. T is discrete

These properties of T reflect the use of time in the following: points of time are seen as indices of states of the inferential system. (More elaborate time-theories are developed and described in Rescher/Urquhart (1971) and van Benthem (1983).) Furthermore, I claim the existence of a minimal element t_0, which corresponds to the initial state. Thus it is possible to use the natural numbers as a canonical index set; cp. also Habel (1985).

An interval T' ⊂ T of time points, i.e. a set

T' := [t_i, t_j] = {t | t_i ≤ t & t ≤ t_j}

will be named a time-span. t_i and t_j are called beginning time, b(T'), and ending time, e(T'), respectively.

By means of T there exists a natural way to speak about the "inference rule r_i used at t_i during an inference process", and to restrict inference processes with respect to time resources.
Given an inferential system <A, R> and a time-span T' ⊂ T, it is now sensible to define

Th_R(A, T') := {S | A ⊢_T' S}

similar to the usual definition of Th based on the relation ⊢ (derivability), with one important distinction, namely that sets of time-restricted theorems are based on time-restricted derivations, characterized as follows:

S1 ⊢_T' S2 iff there exists a sequence of inference rules which constitutes S1 ⊢ S2 and there is an index-mapping from T' into this sequence.

Remark: "time-restricted set of theorems" and "time-restricted derivation" are not defined here in a strict sense; the "definitions" above have to be seen as characterizations. In other words, in the present paper only the idea of a definition is given, in the form of sketchy remarks.

From these characterizations it should be clear that the cognitively relevant set A#, representing derived knowledge, is of the type described above, i.e. A# contains the knowledge entities derived during some time-span T'. Note that, though A# represents derived knowledge with respect to a specific time-span, nothing is said here about the reasons for deriving just the knowledge entities of A# - A (the difference of the knowledge sets) during T'. Questions of this type are not a topic of the present paper.

Let me return to T again. Up to now, we have used only one subset of T, namely the time-span T'. In the next extension of the theoretical inventory I will make use of specific coverings of T: Let 𝒯 = <T_1, ..., T_k> be a sequence of time-spans with e(T_j) = b(T_{j+1}) for j = 1, ..., k-1 and b(T_1) = t_0, the minimal time-point. 𝒯 is called an initial covering of T.
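The characterization of time-restricted theorem sets given above can be illustrated by a small sketch. It rests on assumptions that go beyond the text: time points are natural numbers, a time-span is a Python `range`, sentences are strings, and each time point indexes one round in which every rule is applied once to the current knowledge set. The paper itself leaves open which rule is applied at which time point; the names `modus_ponens` and `time_restricted_theorems` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List, Set

Sentence = str
Rule = Callable[[FrozenSet[Sentence]], Iterable[Sentence]]


def modus_ponens(known: FrozenSet[Sentence]) -> Set[Sentence]:
    """From 'p' and 'p -> q' derive 'q' (toy string encoding)."""
    derived = set()
    for sentence in known:
        if " -> " in sentence:
            premise, conclusion = sentence.split(" -> ", 1)
            if premise in known:
                derived.add(conclusion)
    return derived


def time_restricted_theorems(axioms: Set[Sentence],
                             rules: List[Rule],
                             time_span: range) -> Set[Sentence]:
    """Sketch of Th_R(A, T').

    Assumption: every time point in the span licenses exactly one round
    of direct derivations, so the length of the span bounds the depth
    of the derivations that can be carried out.
    """
    known = set(axioms)
    for _t in time_span:
        snapshot = frozenset(known)
        for rule in rules:
            known.update(rule(snapshot))
    return known


A = {"p", "p -> q", "q -> r"}
R = [modus_ponens]

short_span = range(0, 1)   # T' = [t_0, t_0], one time point
longer_span = range(0, 2)  # T' = [t_0, t_1], two time points

derived_short = time_restricted_theorems(A, R, short_span)
derived_longer = time_restricted_theorems(A, R, longer_span)
```

Under these assumptions the shorter span yields "q" but not yet "r": the sentence "r" is derivable (it belongs to A*) but is not actually derived (it does not belong to A#) within that span.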
By means of such initial coverings it is possible to characterize recursively a sequence of inferential systems and sets of knowledge entities:

Given <A, R>, an inferential system, and 𝒯 = <T_1, ..., T_k>, an initial covering:

A_1 := A
A_{j+1} := Th_R(A_j, T_j)

The dynamical development of derived knowledge can be exemplified graphically as follows:

A_1 ⊢_T1 A#_1
A_2 ⊢_T2 A#_2
A_3 ⊢_T3 A#_3

Thus we have changed from the investigation of one inferential system (or formal theory) to sequences of inferential systems.

From a cognitive point of view it is obvious that the situation in natural inference systems is much more complicated than the sketch given above demonstrates. The most important further extension would concern the dynamics of R. This means, in contrast to the case described above, in which every <A_j, R> uses the same set of inferential rules R, that we have to consider the development (over time) of R_j too; i.e. interesting (because cognitively realistic) inferential systems have a behaviour of the <A_j, R_j>-type. Steps from R_j to R_{j+1} can be seen as induced by learning processes. (Cp. Emde/Habel/Rollinger 1983, Habel i. prep.)

5. Conclusion: Mutual relation between logic and psychology

In this concluding section I will summarize the main results and insights which should be influential between the poles "logic" and "psychology" of the Cognitive Science spectrum. Furthermore, I will formulate some mutual requirements.

Logic and "theoretical AI" offer many results on the formal properties of information processing systems, especially inferential systems. These results concern e.g. problems of decidability, generative capacity and complexity. From a cognitive point of view there are some desiderata, e.g. with respect to the dynamics of inferential processes and systems. Furthermore, traditionally only some phenomena of human knowledge processing are the subject of logic. For example, beyond the standard quantifiers "all" and "some" only few results exist (cp. the remarks on MOST in section 3).

In contrast, psychology offers empirical data and theoretical models with respect to derivations by natural inference systems. The formal analysis of the psychologist's models requires methods from logic and theoretical AI.

That formal analysis is useful is demonstrated by the example of "most"-inferences. Up to now no fully adequate formal treatment of "most" exists. The most promising approaches can be classified into two types: cardinality approaches vs. default approaches. The former class contains the formalizations of Rescher (1962), Kaplan (1966) and Barwise/Cooper (1981), the latter that of Reiter (1980). Both types of formalizations have been investigated with respect to formal properties, e.g. decidability or the existence of proof procedures. Such formal results concern the topic of theoretical evidence mentioned in section 1: e.g. a lower "theoretical cost" of one formal system w.r.t. another could possibly correspond to lower cognitive costs of the natural inference systems in question. This could explain why the cheaper system is selected in spite of missing validity (in a formal sense). (The usefulness of theoretical evidence for Cognitive Science will be the topic of a future paper.)

References:

Barwise, John/Cooper, Robin (1981): Generalized Quantifiers and Natural Language. Linguistics and Philosophy 4 (2). 159-219.
Barwise, John/Perry, J. (1983): Situations and Attitudes. Bradford Books/MIT Press: Cambridge, Mass.
van Benthem, J.F. (1983): The Logic of Time. Reidel: Dordrecht.
Boolos, George (1987): A Curious Inference. Journal of Philosophical Logic 16. 1-12.
Carnap, Rudolf (1939): Foundations of Logic and Mathematics. Encyclopedia of Unified Science, Vol. I, No. 3: Chicago.
Churchland, Paul M. (1984): Matter and Consciousness. MIT Press: Cambridge, Mass.
Cushing, Steven (1987): Some Quantifiers Require Two-Predicate Scopes. Artificial Intelligence 32. 259-67.
Drever, James (rev. ed. 1964): The Penguin Dictionary of Psychology. Penguin: Harmondsworth.
Emde, Werner/Habel, Christopher/Rollinger, Claus-Rainer (1983): The discovery of the equator or concept driven learning. Proc. 8th IJCAI. 455-458.
Habel, Christopher (1983): Logische Systeme und Repräsentationsprobleme. in: B. Neumann (ed): GWAI-83. Springer: Berlin. 118-42.
Habel, Christopher (1985): Referential Nets as Knowledge Structures. in: Th. Ballmer (ed): Linguistic Dynamics. de Gruyter: Berlin. 62-84.
Habel, Christopher (i. prep.): A note on the dynamics of rule systems.
Haugeland, John (1978): The Nature and Plausibility of Cognitivism. in: J. Haugeland (ed): Mind Design. Bradford: Montgomery, Vt. 243-81.
Higginbotham, James (1983): The logic of perceptual reports: An extensional alternative to Situation Semantics. Journal of Philosophy 80 (2). 100-27.
Hintikka, Jaakko (1962): Knowledge and Belief. Cornell Univ. Press: Ithaca, NY.
Johnson-Laird, P.N. (1983): Mental Models. Cambridge UP: Cambridge.
Kalish, Donald/Montague, Richard (1964): Logic - Techniques of formal reasoning. Harcourt, Brace & World: New York.
Kaplan, David (1966): Rescher's Plurality Quantification. Journal of Symbolic Logic 31. 153-4.
Levesque, Hector J. (1984): A Logic of Implicit and Explicit Belief. AAAI 1984. 198-202.
Newell, Allen (1980): Physical Symbol Systems. Cognitive Science 4. 135-83.
Pylyshyn, Zenon (1984): Computation and Cognition. Bradford/MIT Press: Cambridge, Mass.
Reiter, R. (1980): A Logic for Default Reasoning. Artificial Intelligence 13. 81-132.
Rescher, Nicholas (1962): Plurality Quantification. Journal of Symbolic Logic 27. 373-4.
Rescher, Nicholas/Urquhart, Alasdair (1971): Temporal Logic. Springer: Wien.
Ryle, Gilbert (1949/1973): The Concept of Mind. Penguin University Books: Harmondsworth, Middlesex.
Tarski, Alfred (1965): Introduction to Logic. Oxford UP: New York.

Author's Note:

The author's address is: Christopher Habel, Universität Hamburg, Fachbereich Informatik, Bodenstedtstr. 16, D-2000 Hamburg 50, Fed. Rep. Germany. An earlier version of this paper was presented at a symposium on Knowledge Representation at the 35th Annual Meeting of the German Psychological Society (Deutsche Gesellschaft für Psychologie) in September 1986.